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ABSTRACT 

Direct lens modeling is the key to successful and meaningful automated strong galaxy-scale grav- 
itational lens detection. We have implemented a lens-modcling "robot" that treats every bright red 
galaxy (BRG) in a large imaging survey as a potential gravitational lens system. Using a simple 
model optimized for "typical" galaxy-scale gravitational lenses, we maximize the multiply-imaged 
source plane flux and generate, for the resulting best lens model, four assessments of model qual- 
ity that are then used in an automated classification. The robot infers from these four data the lens 
classification parameter H that a human would have assigned; the inference is performed using a prob- 
ability distribution generated from a human-classified training set of candidates, including realistic 
simulated lenses and known false positives drawn from the HST Extended Groth Strip (EGS) survey. 
We compute the expected purity, completeness and rejection rate, and find that these statistics can 
be optimized for a particular application by changing the prior probability distribution for H] this is 
equivalent to defining the robot's "character." Adopting a realistic prior based on expectations for 
the abundance of lenses, we find that a lens sample may be generated that is ~ 100% pure, but only 
~ 20% complete. This shortfall is due primarily to the over-simplicity of the lens model. With a more 
optimistic robot, 90% completeness can be achieved while rejecting ~ 90% of the candidate objects. 
The remaining candidates must be classified by human inspectors. Displaying the images used and 
produced by the robot on a custom "one-click" web interface, we are able to inspect and classify lens 
candidates at a rate of a few seconds per system, suggesting that a future 1000 square degree imaging 
survey containing 10^ BRGs, and some 10^ lenses, could be successfully, and reproducibly. searched in 
a modest amount of time. We have verified our projected survey statistics, albeit at low significance, 
using the HST/EGS data, discovering four new lens candidates in the process. 

Subject headings: gravitational lensing - methods: data analysis - methods: statistical - techniques: 
miscellaneous - galaxies: elliptical - surveys 



1. INTRODUCTION 

Large, well-defined samples of strong gravitational 
lenses enable many important astrophysical and cosmo- 
logical investigations. These include measuring the pro- 
jected dark and luminous mass distributions of galax - 
ies (e.g. lTreu fc Koopmansll2004l : lKoopmans et al.l[2006[ ). 
measuring the expansio n rate of th e univ e rse H(z) wit h 
lens time delays (e.g. ISuvu et all l2008l : lOguril I2007D . 
and also cosiriologica l volumes and distance ratio s (e.g. 
iMitchell et all l2005t ICapelo fc Natarajanl l2007t l. and 
probing the dark galaxy sub s truct u re predicted by CDM 
models (jKochanek fc Dalall |2004| : iBradac et al] |2004[ ). 
Extended lensed source galaxy images provid e more con- 
strain ts on the lens mass distribution (e.g. iKoopmand 
|2005( ) than do lensed point sources. Since lensing con- 
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serves surface brightness, the distant lensed ("source") 
objects are greatly extended and magnified, revealing de- 
tails that may not be otherwise visible. For example, 
recent observations using gravitational lenses as "cosmic 
telescopes" have provide d unprecedented view s of form- 
ing dwarf galaxies (e.g. [Marshall et al.l l2007f). Lyman- 
break galaxies at z = 2 — 4 (see e.g. iBunker 'era[200(Tl : 
ISmail et al.ll2007l: lAUam et al.ll2007f). and quasar accre- 
tion disks (e.g. iPoindexter et al.ll2008l ). Having a large 
array of cosmic telescopes will allow us to select the best 
ones for any given application, but will also allow us to 
build up a statistical picture of the source population. 

Since the first st rong lens was discovered, (Q 0957-1-561, 
IWalsh et al.lll979f l. the ~200 galaxy-scale lenses known 
have been found by a combination of serendipi- 
tous discovery and a host of systematic search tech- 
niques. These techniques include visual inspection 
of deep optical imaging obtained with t he Hubble 
Space Telescope (HST; e g. iHogg et al.lll996t IZepf et all 
1997t iRatnatunga et all Il999t iFassnacht et all l2004l : 



Moustakas et al.ll2007l : lFaure et al.ll2008f ) . targeted imag- 
ing of the popula tion of poten tially l ensed quasars or ra- 
dio sources (e.g. iMvers et al. ' 2003: Bro wne et al.ll2003l : 
llnada et all l2003l : lOguri et al.l 12004 : iPindor et al.ll2006[) . 
and followup of systems for which optical spectroscopy 
revea l ed anomalous emi s sion lines (e . g. [ AVarren et all 
[19961: iBolton et all [2003 : iWiUis et al.l \20m . Further 
techniques using time-domain information have been 
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proposed as efficient lens finders (e.g. iPindorl l2005l : 
iKochanek erani2006l ). 

It is a spectroscopic survey that has generated the 
largest single lens sample to date. With the SDSS galaxy 
redshift survey as its f eeder, the Sloan-Le ns ACS Sur- 
vey (SLAGS) project (jBolton et a ll I2006D has discov- 
ered ~ 70 new galaxy-scale strong lenses (jBolton et al.l 
120081) . Candidates are selected for their background 
galaxy emission lines present in the foreground old stellar 
population spectrum, and then confirmed using high res- 
olution 7/5* T" imaging. The SDSS selection function limits 
the median lens redshift to be « 0.2, while the need for 
detectable emission lines hides many lenses and incurs 
a strong "magnification bias" towards ring-like systems. 
Future surveys will extend the lens redshift limit some- 
what, but likely not reach the same area as SDSS. 

Instead, we anticipate that samples of galaxy-scale 
gravitational lenses that are 2-3 orders of magnitude 
larger than the current set will come, at least before 
SKA, from large optical imaging surveys. For example, 
the proposed Super-Nova Acceleration Probe (SNAP) 
mission is slated to include a 10 00 square degree sur- 
vey at HST/WFPC2 resolution (jAldering et all [200i : 
[Marshall et al.|[2005af ). with multi-band imaging to pro- 
vide photometric redshifts of faint galaxies, and discover 
some 10^ strong lenses. From the ground, KIDS, Pan- 
STARRS, DES and LSST will aU make significant ad- 
vances in strong lensing, but will be limited by their an- 
gular resolution: their strengths will lie in the finding of 
large numbers of time-varying, and/or wide-separation, 
lensed systems. The majority of the strong lensing cross- 
sect ion in the universe lies in massive elliptical galax- 
ies (jTurner et al.|[l98^ . Therefore, in a high-resolution 
space-based survey, the majority of lenses will be mas- 
sive elliptical galaxies with redshifts 0.5-1.0, multiply- 
imaging faint bl ue star- forming galaxie s at redshifts 1.0 
and above (e.g. [Marshall et al.l l2"005aD . This suggests 
that an efficient strategy is to focus on bright red galax- 
ies (BRGs) as being the "typical" pote ntial lenses (e.g. 
iFassnacht eran[200l iFaure et al.|[200l . 

Cost limitations mean we may not be able to perform 
spectroscopy on all future lens candidates, whose sources 
are likely to be quite faint, and we will need to rely on 
a better, more quantitative understanding of the imag- 
ing data in hand. Indeed, with only image information 
available, the classification of any lens candidate must 
come entirely from modeling. Does a model for the im- 
age where some of the features are lensed by a massive 
object (consistent with the observed elliptical galaxy) ex- 
plain the data? Are the residuals from this modeling 
process consistent with what we know about early-type 
galaxy structure? We suggest that the optimal way to 
find lenses in optical imaging surveys is to classify objects 
by their ability to be modeled as gravitational lenses. 

In the imaging survey gravitational lens searches al- 
ready carried out, the lens modeling has been performed 
after a sample of candidates has been generated by 
other means, and classified by experienced human in- 
spectors. Indeed, this human inspection stage can be 
sufficiently effective that it can be thought of as an ap- 
proximate lens-modeling process. Present-day HST sur- 
veys of area ~ 1 square degree are small enough that hu- 
man inspection is tedious but feasible. However, looking 
forward to the next generation of high resolution imaging 



surveys, and desiring to build on the extensive human ex- 
pertise in identifying lenses, we are motivated to develop 
an automated lens-finding "software robot." The tire- 
lessness of this robot would enable it to find lenses with 
a well-understood, calculable, and reproducible selection 
function; this will be a vital property for the resulting 
lens samples to be statistically useful. 

In this work, we describe just such a robot: it models 
every possible candidate massive galaxy image in the sur- 
vey as a combination of foreground (lensing) galaxy and 
multiply- imaged background (lensed) source, and then 
interprets the results based on its previous experience 
with both real and simulated data. We will argue that, 
at least at present, the most effective automated meth- 
ods mimic the operation of a human analyst: our robot 
attempts to predict, via a probabilistic model, the clas- 
sification that would have been made by a human. 

Wc can prepare for a future of much larger surveys 
by working with manageably-sized current samples of 
lens candidates from extant high resolution imaging sur- 
veys. The HST archive contains several square de- 
grees of suitably-surveyed (e.g. deep, high galactic lat- 
itude) sky, which we are searching for serendipitous lens- 
ing ev ents (the HAGGLeS p roject, program HST-AR- 
10678 iMarshaU et"al] l2005bl. MarshaU et al. in prepa- 
ration). As a pilot for this project, we carried out a 
by-eye search for lenses in the 0.17 square degree area 
of the HST/ ACS coverag e of the Extended Groth Strip 
EGS, iDavis et al.] I2007f ). finding three definite lenses 
Moustakas et al.l 120071 . hereafter M07), which includes 
one that was found (also by eye) in the HST Medium 
Deep Survey (jRatnatunga et al1ll999r). and four candi- 
dates of lower believabilitv (jMoustakas eraI1[2006l . here- 
after M06). This dataset therefore makes an excellent 
testing ground for new lens detection methods. 

This paper is organized as follows. In Sections [2] (lens 
modeling) and [3] (probabilistic interpretation) wc outline 
our lens-finding algorithm, and describe its implementa- 
tion as a software robot. In Section [4] we educate the 
robot, using a large set of simulated galaxy-scale lenses, 
and a roughly equal number of human-classified non- 
lenses from the HST EGS survey. As a step towards 
understanding the selection function of galaxy-galaxy 
strong lenses, we assess the completeness and purity of 
the robot-generated sample at this point. Then in Sec- 
tion [5l as a "blind test" we apply our automatic lens 
finder to the remaining EGS images used in M06 and 
M07, and compare with the by-eye search results. Fol- 
lowing some discussion of the strengths and weaknesses 
of our approach in Section [51 wc present our conclusions 
in Section [T] All magnitudes referred to are calculated 
in the AB system. 

2. LENS MODELING 

From a given potential lens galaxy, suitably selected 
(e.g. by color and magnitude) and extracted from the 
survey images, we subtract a smooth, symmetric model 
of its own intensity, leaving a residual map that may 
potentially (and perhaps partially) be explained by mul- 
tiple images of a background source. For each possi- 
ble lens model in a finite, gridded space, we compute 
gravitational lens deflections that arc used to transform 
the observed ( "image-plane" ) pixels to their nearest-pixel 
unlensed ("source-plane") positions. Because the models 
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under consideration are multiply-imaging, more than one 
image-plane pixel may map to each source-plane pixel; 
since gravitational lensing conserves surface brightness, 
the pixels of the image plane mapping to the same point 
in the source plane should have equal values. To opti- 
mize the model we construct a scalar that is maximized 
when non-trivial fractions of image-plane pixels "agree" 
on the intensity to be assigned to these source-plane pix- 
els. The bright and dark parts of the residual maps both 
participate in this agreement; indeed, blank sky in the 
image plane where a second, third, or fourth image ought 
to appear, given structure in the source plane, is more 
damning to a possible lens model than image-plane flux 
where no image is expected, and our scalar metric cap- 
tures this. 

The method is motivated by the observation that when 
multiply- imaged galaxies are faint, the best evidence that 
they are indeed multiply-imaged is that there is a reason- 
able gravitational lens model with a reasonable unlensed 
intensity pattern that describes the morphology as ob- 
served. 

In order to make best use of the reader's attention span 
we give a very brief overview of our lens modeling proce- 
dure below, and then expand on the technical details in 
subsection 12.21 We expect many might discreetly push 
ahead to Section [3] at this point. 

2.1. Overview of modeling procedure 

• We assume that most strong lensing galaxies have 
the smooth, symmetric morphologies of early-type 
galaxies, and subtract off the best-fitting set of con- 
centric elliptical isophotes. In practice we fit an 
elliptically-symmetric Moffat profile, which is very 
effective at removing galaxy bulges. Many bright 
red galaxies do show some disk component in their 
images; we identify these by their position and ori- 
entation relative to the major axis of the bulge, and 
mask out all pixels associated with the features. 

• We assume that these galaxies' lens potentials can 
be modeled adequately with singular isothermal 
sphere models plus external shear, and we grid 
this 3-dimensional parameter space with a coarse 
pixelization. The motivation for using this sim- 
ple model is partly empiri cal, based on the result s 
from the SLACS project (jKoopmans et al.ll2006D . 
and partly to keep the CPU time manageable. 

• We assume that the background source sizes, or an- 
gular scale over which background sources vary, is 
comparable to. or larger than the iJ/ST point spread 
function (PSF), and consequently ignore the latter 
when tracing flux back to the source plane. With 
any given lens model we perform a very crude ray- 
tracing to map image plane pixels onto a (finer) 
grid of source pixels. 

• In any putative multiple-imaging system, we re- 
quire (for identification as a lens) that at least 
some of the detected residuals lying in the multiply- 
imaged part of the image plane be explained by a 
multiply-imaged background source. 

• When mapping a set of multiply-imaged image- 
plane pixels back to the same source plane pixel, 



we take the minimum value of the image pixels' 
fluxes. This allows us to use all the information in 
the image, using dark sky to veto non-lensed bright 
sky. 

• Stepping through the entire grid of models, we 
select the one that maximizes this "minimum- 
filtered" source plane flux as our "optimal" model. 

2.2. Technical notes 

The image-plane and source-plane pixel grids we 
choose are determined by the pixelization of the input 
image. We use drizzled HST/ ACS images with pixel 
scale O.OSarcsec, and investigate 6arcsec cutout images 
centered on a pre-selected elliptical galaxy position. This 
selection is an important part of the lens search: one may 
consider using both color and morphology to pre-select 
elliptical galaxies, although this process is made consid- 
erably harder by the (possible) presence of the contrast- 
ing Icnsed images themselves. In this work we adopt the 
inclusive approach of investigating all bright extended 
objects, and exploring the performance of our technique 
on the selection criteria involved. To reduce computa- 
tion time, and improve the accuracy of the ray-tracing, 
wc bin the image plane by a factor of two. This reduces 
the number of deflection angles to be calculated by a fac- 
tor of four (and further justifies our ignoring the PSF at 
this stage). 

The grid of model space we choose for the lensing po- 
tential is three-dimensional: we use a si ngular is othermal 
sphere (SIS) wi th external shear (e.g. iKochanck, 199ll: 
iKormann et al.l [l994). The primary parameter is the SIS 
lens strength, specified as the angular Einstein radius 6e- 
Having a more subtle effect on the lensing behavior are 
the two parameters that describe the magnitude and di- 
rection of the external shear, 7x and These two 
parameters are vital, since a significant fraction of all 
known lenses are four image systems (quads). Without 
the external shear the SIS model can only produce two 
images. All points in this parameter space are treated 
as being equally likely prior to fitting each source, and 
are then stepped through in an exhaustive search. The 
coarseness of the grid was chosen to keep the CPU time 
per system low. We restricted the Einstein radius to lie 
between 0.4 and l.Sarcsec, large enough to include all 
SIS lenses with velocity dispersion in the range 160 to 
350km/s (given a lens at Zd = 0.5 and a source at red- 
shift Zs = 1.0). 

For each lens model we trace the image plane pixel val- 
ues back to the source plane via the usual lens equation, 

f3^e-a (1) 

where the optical axis was taken to be the centroid of 
the elliptical galaxy light, and the two components of 
the deflection angle a are given by 

"1 =^i-^+^x(eiCos20x + e2sin20x) (2) 
1^1 

Q;2 = 6*2 7^ +7X (6*1 sin 20X - 6*2 COS 20x) (3) 
1^1 

We use nearest-neighbor mappings; that is, we "snap" 
the lensing deflections "to grid." This is a time-saving 
device: rather than having square pixels in the image 
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plane map to distorted rectangles in the source plane, 
we are crudely approximating the source using a square 
grid. We will discuss the effects of this approximation in 
Section [4751 here we note that we do at least use a source 
plane that is twice as finely griddcd as the (binned) image 
plane. 

We are specifically interested in multiple-image sys- 
tems: some source-plane pixels will have more than one 
associated image-plane pixel mapping to each one, and 
in general those associated image-plane pixels will come 
from widely separated locations on the image plane. For 
each pixel in the source plane, we record the minimum 
intensity of the contributing image-plane pixels. In this 
"minimum-filtered" map, any image-plane pixel that is 
blank or low in intensity effectively "vetoes" any other 
contributions to the source-plane intensity in that pixel. 
For this reason, the minimum map in fact represents (in 
the absence of noise and pixelization effects) the only 
source-plane flux that can be contributing to the residual 
map, consistent with the multiply imaging lens potential 
model under consideration. This simple estimator com- 
pensates for our inability to model simultaneously and 
effectively all the features in the image, and focuses di- 
rectly on the component of the image we are interested 
in: the gravitationally-lensed component. 

When locating the best lens model for a given sys- 
tem, the scalar objective function we compute is simply 
the total flux in the minimum-filtered source plane. In 
many cases we expect this scalar to be zero, as the mini- 
mum filtering removes all unlensed fiux: this is certainly 
true of the blank parts of the image that are truly blank 
(rather than containing sky noise). For this reason we 
"de-noise" the galaxy-subtracted images by setting all 
pixels not containing significant flux to zero. This thresh- 
olding is do ne with standard object detection software 
(SExtractor. iBertin fc ArnoutsI Il996l ). which associates 
pixels together into objects and flags the remainder as 
background. Specifically, we use the "objects" chcckim- 
age output as our data. 

At each value of the model Einstein radii 9e, we max- 
imize the source plane flux by stepping through the ex- 
ternal shear parameters. This allows us to plot source 
plane flux versus 9e- The optimal model is then the one 
corresponding to the peak of this plot. 

2.3. Multi- filter data 

So far we have made no mention of the achromatic na- 
ture of gravitational lensing, yet this is one of the most 
important pieces of information at the disposal of any by- 
eye lens searcher. This is because the human eye is very 
good at detecting subtle changes in color, and in pick- 
ing out objects of a given color from a noisy background. 
However, the achromaticity is a direct result of surface 
brightness being conserved at all wavelengths; up until 
this point we have just described using surface bright- 
ness conservation in one band. If imaging data in more 
than one band is available to us, how should we include 
this information? We expect the source morphology to 
be different between the different bands, so the lens in- 
versions should be done independently. However, it must 
be the same mass distribution giving rise to any lensing 
effects detected. 

The correct thing to do when analyzing the indepen- 
dent datascts (in this case, the images in different flltcrs) 



is to add the log likelihoods for each band's image to- 
gether. Dropouts (sources with no detected flux in one of 
the bands) will rightly contribute zero to the summed log 
likelihood. For computational efficiency we are not opti- 
mizing the lens model by maximizing its likelihood, but 
we can compute the likelihood of the "optimal" model 
once it has been obtained. In Section [3] below we will 
do exactly this - and combine the likelihoods from the 
different filters images as well. While multiple bands can 
be analyzed in this straightforward way, we note that 
our method does not rely on having multi-filter data ~ 
although we expect performance to improve in the multi- 
filter case. 

One use of any additional filters could be in improving 
the pre-selection of candidate elliptical galaxies. Another 
is that the redder images tend to give more accurate lens 
galaxy centroids (since the redder images have more reg- 
ular morphologies better tracing the dark matter halo). 

2.4. Modeling examples 

In Figure [1] we show three example bright red galaxies 
(drawn from the EGS survey, see Section [5]) , and their 
automated analysis. One is a lens (visually selected by 
M07), and two are potential false positives showing con- 
fusing "lens plane" structure. In each case the robot 
is able to find a lens model that explains some of the 
observed galaxy-subtracted residuals, but with varying 
goodness of fit; in the next section we describe how we 
use this and other information from the robot to make 
an automated classification. 

3. AUTOMATED LENS CANDIDATE CLASSIFICATION 

Given an optimized lens model, we now generate a set 
of data describing the quality of the model in describing 
the observed image. A classification parameter can then 
be inferred from this robot output data vector d, whose 
components we describe below. 

3.1. The robot's output: quantifying lens model quality 

The source plane flux scalar described above is sim- 
ple, and fast, to calculate. However, a much better- 
understood objective function is the log likelihood of the 
predicted source model. To calculate this, we invert the 
lens mapping to produce the corresponding image counts 
predicted by the lens and source model, /p. Assuming 
the Gaussian approximation to the Poisson distribution 
for the (assumed uncorrelated) pixel noise on the data 
image counts /, the log likelihood is given by 

The uncertainty ct^ is estimated by adding in quadrature 
the root mean square pixel value measured in the back- 
ground regions, to the square root of the i**^ detected 
pixel value (thus taking into account, at least approxi- 
mately, the Poisson noise due to the objects themselves). 
We add a further term, also in quadrature, to attempt 
to model the errors incurred by our very simple lens and 
coarsely-gridded source models: since the inadequacies of 
our model are independent of the quality of the data, we 
assume the modeling error will impose a signal-to-noise 
ratio floor a, so that its contribution to the overall uncer- 
tainty can be (crudely) approximated by li/a. We find 
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Fig. 1. — Robot modeling inputs and out puts for three example objects from the EGS survey, one in each row of panels: Top row, the 
"Anchor" gravitational lens found by eye bv lMoustakas et al.l 1 120071) : middle and bottom rows, potential false positives, showing confusing 
lens plane spiral arm structures, some of whose flux is consistent with having been lensed. In each row, numbering the panels from the 
left, we have: 1) the raw HST color cutout image; 2) the image after subtraction of a Moffat-profile model for the lens galaxy light; 3) the 
thresholded image input to the lens modeling robot; 4) the scalar objective function (minimum-filtered source plane flux, see text) plotted 
as function of Einstein radius and optimized over the external shear parameters; 5) the optimal model source plane; 6) the corresponding 
image plane. The images in panels 3 and 6 arc used in computing the goodness of fit statistic ria- (Section 13.11 1. Likewise, in panel 4 a 
Gaussian has been fitted to the peak of the scalar for the purposes of estimating the uncertainty on the Einstein radius, 50-^, and the 
detection ratio R^. All cutout images are 6 arcsec on a side. 



that a = 4 gives reasonable values of log L for a range of 
data quality. In summary, 



+ h + {h/af 



(5) 



The sum in equation |4] is over all non-zero pixels in 
the predicted image, since we are only attempting to ex- 
plain the flux in the multiply-imaged region around the 
putative lens. In principle, one could use the presence 
of additional residual image structure that was not com- 
patible with being lensed as evidence that perhaps there 
is no lensing occurring at all. However, we prefer at this 
stage to be inclusive in the generated candidate list and 
only consider pixels in the multiple-imaging regime. The 
parameter 6 is a rescaling factor that allows us to cor- 
rect the bias towards low source flux introduced by the 
use of the minimum-filtered source plane intensity when 
computing the predicted image /p. We approximately 
marginalize this parameter out by using the value of b 
that maximizes log L. 

The left-hand side of equation |4] is distributed as chi- 
squared for approximately N degrees of freedom, where 
N is the number of non-zero pixels in the predicted im- 
age minus three model parameters. Therefore, we record 
the goodness of fit as the number of standard devia- 
tions Ua- this scalar is from the mean of the chi-squared 
distribution, using the Fisher approximation for large N: 
Pr{2x^\N) w G{2N - 1,1) (where G{m,w) is a Gaus- 
sian distribution of mean m and standard deviation w). 
The image-plane likelihood evaluation is time consum- 
ing; however, as we use the source plane flux scalar to 
quickly determine the "best" model, we need only com- 
pute Tier once. The is the first component of the robot 
output data vector d. 



Our coarse parameter grid and simple lens model be- 
come increasingly ill-suited to the data as the source 
size decreases. For us to detect any flux in the source 
plane the mass model must be good enough to map the 
pixels back to the same point with sub-pixel precision. 
We are helped by the PSF here, which spreads the flux 
around in the image plane, but the resulting minimum 
source plane produces a very sparse image plane. We ac- 
count for this in the likelihood evaluation by applying a 
"restoring beam" to the predicted image. This beam is a 
Gaussian with FWHM shghtly less than that of the ACS 
PSF, and produces more realistic image configurations. 
Since we originally mapped the convolved flux back to 
the source plane, we do the restoring convolution such 
that the peak surface brightness of the source plane is 
preserved into the predicted, convolved, image plane. 

The parameter is clearly a powerful tool for quan- 
tifying how good a fit to the data a lens model provides. 
However, it does not tell the whole story. For example, 
very faint sources give very faint fluctuations in the image 
plane, which can often easily be fitted within the noise by 
a lens model. However, a human classifier is unlikely to 
assign this candidate a high score. We therefore make the 
magnitude of the source the second component of d. We 
also add two more quantities. First is the uncertainty on 
the lens model Einstein radius, SOe' a convincing lens 
model should give a very small 69^- We estimate this 
from a Gaussian approximation to the peak in the plot 
of source plane flux against Einstein radius (Figure [TJ 
panel 4). The second is related to this but subtly differ- 
ent: in a clean lens with just one source plane, we expect 
only one lens model to match the data, and so this same 
plot should have only one sharp peak. Therefore, as our 
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final d-component wc take the ratio i?d of the integral 
under the Gaussian fit to the integral under the whole 
source flux curve. 

The four scalars described here that comprise d were 
chosen purely for their intuitive information content. We 
are not suggesting they form a complete set, or that they 
have been optimized for this application - only that they 
are likely to provide some discriminatory power between 
different candidates. Indeed, we anticipate that much of 
the future work improving the performance of the robot 
might center on generating better d-vectors, and so im- 
proving the "capacity" of the robot to learn. We will 
return to this issue in Section [6l 

We list the robot outputs d for the three examples 
shown in Figure [1] in Table [1]- this table illustrates some 
of the behavior suggested in the above discussion. 

3.2. Inferring the human classification parameter 

Having defined our set of scalars describing the quality 
of a lens model (the robot output d), we must now de- 
cide on our criteria for assessing the robustness of each 
strong lens candidate. Before we continue, let us try to 
understand this last statement. Naively we might hope 
to quantify the quality of a lens candidate by seeking the 
"probability that it is a gravitational lens." However, to 
calculate this it would also be necessary to compute the 
probability that the candidate is NOT a gravitational 
lens as well. The problem is that we do not yet have 
a sufficiently detailed and quantitative understanding of 
galaxy morphology to be able to do this. In the pres- 
ence of disk components, spiral arms, satellite galaxies 
and so on, it is genuinely difficult to disentangle the con- 
tributions to the image from the lensed sources versus 
from the contribution from the candidate galaxy itself. 
Clearly, if the value of the detected source plane fiux is 
zero then the system can be rejected. But we can expect 
a further continuum of likelihood values: if the robot is 
to succeed in returning a useful, shortened candidate list 
for our inspection, it must first learn what makes a good 
candidate. 

While our quantification of galaxy structure is poor, 
our ability to identify gravitational lenses in high reso- 
lution images by eye is not. This suggests that a much 
better-defined measure of lens candidate robustness is 
the posterior probability distribution Pr{H\d), where 
H is the classification that a human would have assigned 
the system. 

In Table [2] we propose a simple four-point system for 
human classification of gravitational lenses. Our experi- 
ence is that fewer than four classes is not flexible enough 
to describe a set of lens candidates, while the differences 
between any more than four classes become too small for 
different classifiers to agree upon. 

If we have a set of lens candidates, each with robot 
data vector d^ and known human class Hi, then we can 
estimate the probability density function (PDF) Pr{d\H) 
from these clouds of points in d-space, H-value by H- 
value. We can then use this model PDF to compute 
Pr(dj|7?) for the j'^ candidate. Pr(7J|dj) is given by 
Bayes' Theorem, 

Pr(if|dj) cx FT{dj\H)PT{H). (6) 

The prior PDF Pi{H) encodes our expectations for the 
relative frequencies of each human classification H . One 



can imagine that there might be a great many low-H 
(poor quality) candidates and only a few high-77 (and 
so very robust) candidates. In Section below we will 
explore two quantified prior PDFs. 

To make progress we now need to determine Pi{d\H). 
For this we require a large sample of candidates, con- 
taining both lenses and non-lenses, and showing similar 
problems to real candidates. In the following subsection 
we describe the two sources of this training set, and the 
resulting education of our robot. 

4. TRAINING THE ROBOT 

Having implemented an efficient lens modeling robot, 
we now have to transfer to it our knowledge of lens, and 
more importantly, intrinsic BRG structure. The proce- 
dure is to generate a well-defined set of lens candidates 
whose human classification H is known, and that sample 
well the possible range of i7-values. 

4.1. Non-lenses in the EGS survey 

The 63 ACS images^ of the 0.18 de^^ HST mosaic of the 
Extended Groth Strip survey (EGS. iDavis et al.l[2007h 
were taken in both the F814W ("/sM-band") and F606W 
("V'eoe-band") filters, to depths of 28.14 and 27.52mag 
respectively (S-cr limits for point sources in 0.12arcsec- 
radius circular measurement apertures). This survey 
contains a control sample of three strong lenses con- 
firmed by their image modeling (and in two cases by 
spectroscopy as well) , and four plausible strong lens can- 
didates, all identified by an independent search by visual 
inspection of the ACS frames (M06 & M07). We wiU use 
this sample in Section [5] for comparison with our auto- 
mated search results. Here, we aim to produce a catalog 
of BRGs that includes these known lenses, and then de- 
fine a sample of known non-lenses with which to educate 
the robot. 

To avoid any morphological bias in the candidate lens 
galaxies, we only minimally pre-filter the catalog of ob- 
jects detected in the EGS fields, applying only a signal- 
to-noise and a color cut. Objects with /814 < 22.0 and 
(^06 —-^814) > 0.8 were selected, giving a sample of 1085 
BRGs. All of the confirmed lenses, and three of the four 
lower-quality lens candidates, identified by M06 & M07 
passed this cut. For our training set of non-lenses, we 
divided the 63 EGS fields into two sets, a training set of 
3 fields, and a survey set of 60 fields. The three training 
fields were chosen to contain none of the lens M06/M07 
candidates: the fields chosen were egs2101, egs2102, 
and egs2103. We then analyzed the 97 BRGs in these 
three training fields that passed our color- magnitude cut, 
generating optimized lens models for each object. These 
systems were classified by one of us (PJM), and binned 
by i/- value. 

4.2. Simulated lenses 

Sampling BRGs that are lenses is harder - strong grav- 
itational lenses are rare objects. Many of the known 
galaxy-scale lenses were found eithe r via source-oriented 
radio source or qu asar surveys (e.g. iBrowne et al1l2003l : 
lOeuri et al.l[200l. or in un resolved spectroscopic sur- 
vevs (e.g. iBolton et al.ll2006[ ): they tend to have either 

^ Available from [http : / /aegis .ucollck ■ org] 
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TABLE 1 

Robot outputs for the three example lens candidates in Figure [T] 



Object name 


Position in 


n„ 






i?d 


Souree magnitude 


Human class 


Robot class 




Figure nn 




(arcscc) 


(aresee) 




AB, F606W 


(H) 


(Hr) 


HSTJ141833. 11+524352.5 


top 


4.11 


0.84 


0.06 


0.62 


28.40 


3 


2.9 


HSTJ141856. 16+523843.5 


middle 


8.93 


0.48 


0.05 


0.27 


30.23 





2.1 


HSTJ141828. 06+523646.1 


bottom 


21.56 


0.54 


0.12 


0.56 


27.96 





1.4 



TABLE 2 

Human lens classification system used in this work. 

Class H Meaning 

Definitely not a lens 

1 Possibly a lens 

2 Probably a lens 

3 Definitely a lens 

point-like or highly-magnified images, and are therefore 
not representative of the lenses we expect to find in high 
resolution imaging surveys. 

The best training set for our purposes is then a sam- 
ple of lenses whose properties match those of the lenses 
we expect to find in a high resolution optical imaging 
survey. To this end we simulated a realistic popula- 
tion of lenses and generated mock images of them. For 
the lenses, we used a set of 68 morphologically-selected 
early-type galaxies from the EGS survey ACS images. 
We made a very rough estimate of their redshift, by as- 
suming a velocity dispersion of 220km/s and using the 
Faber-Jackson relation to estimate luminosity, and so 
distance modulus given apparent /814-band magnitude. 
We then drew a velocity dispersion from a Gaussian dis- 
tribution of mean 220km/s and width 20 km/s. The 
variety of redshifts and velocity dispersions is sufficient 
to give a reasonable distribution of Einstein radii. We 
also measured the position, ellipticity and orientation of 
the early- type galaxy light: these parameters were then 
used, along with the model velocity dispersion, to de- 
fine an SIE model lens for each lens galaxy. To complete 
the lens model we added external shear with amplitude 
drawn fro m a log-normal distribution of mean 0.05 and 
width 0.6 ([Hold er .4L^chcchtcr 2003), and external con- 
vergence equal to half the external shear magnitude. 

For the sources, we drew faint galaxies from 
the EGS catalog, extracting their magnitudes, sizes, 
ellipticities and orient ations and us ed the se data 
to define a simple Ide Vaucouleurd (|1948[ ) bulge 
plus exponential disk model. We made a ro- 
bust faint galaxy selection (22 < /814 < 27, and 
0.15 arcsec < FWHM < 0.36 arcsec) and approximately 
corrected the sizes for the ACS PSF by subtracting its 
Gaussian-approximated width in quadrature; the result- 
ing corrected half-light radius was then asserted for both 
components of the disk-|-bulge model. We drew the 
disk/bulge ratio from a Gaussian of mean 3.0 and width 
0.5, since we expect most faint blue source galaxies to be 
disk-like. For the source redshift we again approximated 
it using the /8i4 magnitu de, following a simpli fied ver- 
sion of the recipe used bv lMassev et al.l ()2004bf ): Pr(2;s) 
was assumed to be a Gaussian of width 0.4, centered on 
z„ = 0.7 + 0.15(7814 - 22). 

We generated a number of possible sources for each 
lens galaxy, providing more lens systems than early-type 
galaxies. Once we had both source and lens redshift, 
and SIE velocity dispersion parameter, we computed the 



Einstein radii and rejected systems with 9e < 0.4 arcsec 
as being unobservable (Section 12. 2[) . We then generated 
a source position drawn uniformly from a box of width 
0.36'e centered on a point offset by (0.15, 0.15)6'e from the 
optical axis of the lens. This rather complex recipe was 
designed to avoid an overabundance of ring-like lenses 
while keeping us within the reasonably high magnifica- 
tion regime. Finally, we shuffled the list of simulated 
lens systems and selected the first lOO systems to make 
simulated images. 

Figure [2] shows the redshift and magnitude distribu- 
tions of the simulated lenses. The source redshifts can 
be seen to be roughly consistent with h aving been drawn 
frorn a distribution like that derived bv lLeauthaud et al.l 
((20071 ) for the appropriate EGS magnitude limit, once 
the lensing (which favors higher source redshifts) has 
been taken into account. The magnitudes of the lenses 
are consistent with the bright (and hence massive) end 
of the early type population (as characterized using the 
morphologically-sele cted and sp ectroscopically-observed 
GOODS sample of iTreu et all [20051. The (unlensed) 
magnitudes of the sources are consistent with their par- 
ent EGS population, with the piling-up at magnitude 26 
being due to a combination of the magnitude limit and 
the lensing shift to higher mean redshift. 

We made mock, noise-free ACS images of the lensed 
images given the lens model described above; the process 
is the same as described (a l beit i n an inferential setting) 
in M07 and lMarshall et al.l (|2007t ). We approximated the 
ACS PSFs by a Gaussian of FWHM = 0.14 arcsec and 
convolved the simulated images with this kernel, before 
adding it to the relevant EGS early-type galaxy image 
(in each filter). The resulting composite image has ap- 
proximately the correct noise properties: we neglected 
the small contribution to the noise from the lensed im- 
ages, as the lens galaxy surface brightness is often higher 
anyway. The end result is a set of 100 simulated lens 
galaxy cutout images that can be analyzed in an iden- 
tical manner to the EGS lens candidates themselves. A 
gallery of examples is given in Figure [3l 

4.3. Modeling Pr(d|i7) 

We divided our training set (100 simulated lenses, plus 
97 non-lenses from EGS) into 4 i7-value bins, and for 
each H bin, plotted all of the robot output parameters 
in the vector d against each other. These parameters 
were assumed to have been drawn from the distribu- 
tion Pr(d|i7); from the plots we estimated the number 
of modes of this PDF, and selected the samples associ- 
ated with each mode by a series of orthogonal cuts in 
the values of d. The modes were then modeled as multi- 
variate Gaussians, defined by the means and covariance 
matrix of each mode. After normalizing each mode by 
the number of samples used to compute the properties 
of the mode, the overall PDF Pr(d|i7) for that value of 
H is given by the simple sum of these Gaussian modes. 
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Fig. 2. — Redshift (left) and /814-band magnitude (right) distributions for lens and source galaxies in the 100 simulated lens systems used 
in this paper. Green solid histograms show simulated lens galaxies, red dashed histograms show simulated source galaxies. Blue dotted 
curves show the parent source population - the redshift distribution from lLeauthaud et al.l 1 20071) and the EGS source counts. The black 
dot-dashed histogram is the morphologically-selected spheroid sample from lTreu et al! I II2OO5I ). 




Fig. 3. — Mock EGS data for 12 example simulated gravitational 
lenses. All cutout images are Oarcsec on a side. 

In some cases the modes were broadened to provide 
a more satisfactory picture of the PDF; in some cases 
the Gaussian approximation breaks down leaving an ill- 
fitting distribution. This is of course a very subjective 
way of modeling the PDF, but this does not matter: all 
we are attempting to do is transfer our expertise to the 
robot such that it agrees with us in the generation of 
H-values. Since these are themselves subjective there 
seems little point in insisting on objectivity at every turn. 
While we can see there is room for improvement in the 
accuracy of the PDF model, the robot's more important 



quality is that of reproducibility: having taught it about 
human classification, the robot will be consistent in its 
own classification (unlike humans) , such that a selection 
function can be estimated from realistic simulated data. 

In Figures |3]-[7] we show plots of all four PDFs modeled 
in this way: Pr(d|H = 0), Pr(d|i7 = 1), Pr(d|i? = 2), 
and Pr(d|i7 ~ 3). These plots provide the quantifica- 
tion of the common sense we would like the robot to 
have when inspecting lens candidates. In each panel, the 
contours enclose 68% and 95% of the marginalized prob- 
ability density. 

4.4. Robotic lens classification 

In this Section we describe the automated classification 
of the training set, and estimate the completeness and 
purity of the resulting sample. 

We use the PDFs described in the previous Section to 
compute the posterior PDF for the classification param- 
eter H as follows: 



Pr(i7|d) cx ^Pr(d|iJ = i)Pr{H = i), 



(7) 



and then choose as our estimator for H the posterior 
mean, denoted (where "r" is for "robot"): 



Hr = 



E,^Pr(g^^|d) 
E,Pr(i/ = i|d) 



(8) 



We note that H^, the robot's estimate of H, is a contin- 
uous variable even though H is not. In Section [5] below 
we combine human classification parameters from several 
human inspectors, and so there H docs not take integer 
values either. 

In order to infer the human classification H we need 
to specify the prior probability Pt{H). This is exactly 
what the human classifier is doing when reminding her- 
self that lenses are rare, so that class-3 candidates should 
be considerably rarer than class-0 objects (which make 
up the majority of objects). In this spirit we might as- 
sign a prior based on what we expect the fractions of the 
different classes might be. A reasonable estimate for the 
number of BRGs acting as lenses is 1 in 1000, and we 
might hope for, say, 20 times more probable lenses than 
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Fig. 4. — Pr(d|H = 0) derived from the robot outputs 
(points) for the training set. Points correspond to objects 
classified by a human (PJM) as class H = 0, definitely not 
a lens. In this and subsequent PDF figures the contours 
enclose 68% and 95% of the total probability. 




Fig. 6. — Pr(d|H = 2) derived from the robot outputs 
(points) for the training set. Points correspond to objects 
classified by a human (PJM) as class H = 2, probably a lens. 




Fig. 5. — Pr{d\H = 1) derived from the robot outputs 
(points) for the training set. Points correspond to objects 
classified by a human (PJM) as class H = I, possibly a lens. 




Fig. 7. — Pr{d\H = 3) derived from the robot outputs 
(points) for the training set. Points correspond to objects 
classified by a human (PJM) as class H = 3, definitely a 
lens. 



actual lenses. With a similar argument for class 1, we 
end up with Pr(i7) = [0.9,0.08,0.019,0.001]. 

With this assumption, we calculated the posterior 
mean for each object in the training set, and com- 
pared it with its true (human-determined) value of H. 
This comparison is shown as a "completeness chart" 
in the far left-hand panel of Figure [51 Completeness 
C{H; Hy) is defined as the percentage of objects with 
human class H that have robot class H^. For example, 
in Figure [H 19% of objects with human class H = 3 
were found to also have robot-assigned class ~ 3. In 
this chart, the percentages sum to 100 in columns. The 
ideal robot would give a completeness chart that would 



be white (0%) everywhere except on the diagonal, where 
it would be black (100% complete). 

The apparently low success rate of the robot at high-i/ 
must be measured against its performance at low- 77: an 
impressive 98% of definite non-lenses are rejected. This is 
just the usual trade-off between completeness and purity. 
Defining purity P{H^\H) as the percentage of objects 
with robot class that actually have human class H , 
we can plot this as a complementary purity chart - ex- 
cept that now, the proportions of objects in each human 
class H-hin need to be realistic for the numbers to make 
sense. Our training set does not have realistic propor- 
tions of objects of each class (nearly half are known to 
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be lenses); this is fine for estimating completeness, but 
must be corrected when estimating purity. Here we sim- 
ply adjust the calculated purities to what they would 
have been had the proportions been those assumed in 
the prior Pr{H) defined above. In the purity chart, the 
percentages sum to 100 in rows. The ideal robot would 
give a purity chart that would be white (0%) everywhere 
except on the diagonal, where it would be black (100% 
pure). We show the purity chart for the realistic-prior 
robot in the second left-hand panel Figure [51 

High purity means high efficiency: with a realistic prior 
for the abundance of each class of candidates, the robot 
class 3 sample is 100% pure. This is certainly very en- 
couraging for future surveys, where the human inspec- 
tion is time consuming and costly. However, in the short 
term we might prefer a relatively impure sample of can- 
didates to be output by the robot, in return for a higher 
completeness. We can achieve this by altering the prior 
on H, analogous to having a inspector of different char- 
acter. An "optimistic" robot might have a prior PDF 
Pr(iJ) = [0.05,0.10,0.25,0.60], which, while not reflect- 
ing our expectations of the abundance of lenses at all, 
has the practical effect of increasing the chances of a 
high classification value. In the rightmost two panels 
of Figure [8] we show the completeness and purity of the 
samples generated by a robot of this character - its be- 
havior is now such that the purity of the Hp = 3 bin is 
low (1%), while the completeness at 7? = 3 is rather high 
(89%). The optimistic robot's classifications for the ex- 
ample systems in Figure [T] are shown in the final column 
of Table □ 

Finally, for comparison we also show the performance 
of a "naive" robot - one whose prior is uniform in H - 
in Figure [9l While this is something of a compromise 
between optimism and realism, it is not designed to be 
such; it is neither complete enough nor pure enough to 
be useful. This should not be a surprise: the prior asso- 
ciated with this robot is one of maximal ignorance. We 
summarize the prior PDFs associated with each robot 
character in Table [3l 

TABLE 3 

Robot characters and corresponding prior PDFs Pr{H). 



Character 


Pr{_H' = 


0) Pr(_H' = 


1) Pr(_H' = 


2) Pr(_H' = 3) 


Realistic 


0.900 


0.080 


0.019 


0.001 


Optimistic 


0.050 


0.100 


0.250 


0.600 


Naive 


0.250 


0.250 


0.250 


0.250 



If an unrealistic prior is employed, a 100% pure sam- 
ple of discovered lenses can be generated by a final hu- 
man classification of a subset of robot-selected candi- 
dates. The cost of the optimistic robot is that in each 
square degree some 670 robot class-3 candidates must 
be visually inspected by a human to recover the 9 lenses 
present. To be 100% complete in human class H = 3 sys- 
tems, the robot class = 2 systems must be inspected 
as well - and there are some 4500 of these. As well as 
the percentage purities, we also show (in parentheses) 
on the purity charts in the figures the predicted num- 
bers of objects in each bin for a 1-square degree survey 
area (assumed to contain 10^ BRGs). Tabic |3] shows the 
overall completeness and purity for various search strate- 
gies - e.g. having humans inspect all objects with robot 



class above some threshold - making clear the trade- 
offs involved. We illustrate the numbers with two fidu- 
cial imaging surveys, representing approximately what is 
possible now using HST archive images, and in the fu- 
ture with a space-based optical survey telescope such as 
SNAP. 

A crucial practical aspect of future large-scale surveys 
is the need to cope with the extreme numbers of BRGs 
involved. To quantify this we compute the rejection rate 
(the percentage of BRGs that are rejected by the robot 
and not passed to the human classifier) , and the number 
of candidates needed to be inspected by humans. From 
this we estimate the time required to carry out the human 
classification step. 

Table [4] shows the rejection rates and expected clas- 
sification time At for our two fiducial imaging surveys, 
assuming an average of 10 seconds inspection time per 
object, and a team - for the SNAP survey - of 10 human 
inspectors. We see that a 20% complete SNAP sam- 
ple generated by the realistic robot would contain 2000 
lenses and no false positives, and require negligible clas- 
sification time (~ 1 hour). A ~ 90%-complete sample 
consisting of 670,000 candidates could be generated by 
the optimistic robot, and human-classified in 5 weeks. 
We note that the CPU time (in 2008) for the current 
(and unoptimized) robot, on a 100-node compute farm, 
is approximately 2 hours per square degree, or 10 weeks 
for the SNAP survey. 

To put this in context, the M07 search was carried out 
by one of us (LAM) inspecting "3-color" JPG images of 
all 63 ACS frames, a procedure that might be expected 
to be inefficient and prone to error. Indeed, the search 
took 10 minutes per frame, or 60 hours per sq degree. At 
the same rate the 1000-square degree survey would take 
10 full-time workers 3 years to complete. Just targeting 
massive galaxies and inspecting sequences of small cutout 
images leads to a significant increase i n efficiency (e.g. 
iFassnacht et al.ll200llFaure et al.ll2008[ ): however, visual 
inspection of every elliptical galaxy would still take 70 
weeks with the same inspection team. While this is a fac- 
tor of 2 improvement over the EGS search rate, the robot 
brings the cost down further by reducing the number of 
systems needed to be human-inspected: at ~ 90% com- 
pleteness the BRG rejection rate is ^ 90%. Furthermore, 
we found that displaying the candidates to the human 
classifier via a "one-click" web classification tool led to 
a significant reduction in the time needed to assess each 
one. Optimizing the information at the inspectors' dis- 
posal should allow human classification to be performed 
at an average rate of just a few seconds per candidate. 
The survey classification time estimates in Table H] are 
therefore quite conservative. 

4.5. The accuracy of the robot outputs 

We now investigate the robot's modeling results for the 
54 simulated lenses classifcd as H = 2 (17 objects) and 
H = 3 (37 objects). How accurate, and hence useful, are 
the lens model parameters inferred by the robot? Since 
the lens model used by the robot is simpler than that 
used to generate the simulations, we restrict ourselves to 
an investigation of the Einstein radius, which we might 
hope to infer reasonably accurately given the relatively 
small shears and ellipticities involved in the simulations. 
We have the true, underlying values of 9e for each sys- 
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Completeness — realistic Purity — realistic Completeness — optimistic Purity — optimistic 
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Fig. 8. — Left two panels: completeness (left) and purity (right) charts for a realistic prior distribution of //-values, namely one where 
the object classifications are as expected given what we know about the relative scarcity of gravitational lenses. Right two panels: same, 
but for an optimistic prior distribution of //-values, namely one where the majority of BRGs are expected to be acting as observable 
gravitational lenses. Completeness values are the percentages of the total number of candidates of true class H in each robot class bin 
- these percentages sum to 100 in columns. Purity values are the percentages of the total number of candidates in each robot class 
bin that have true class H - these percentages sum to 100 in rows. The numbers in parentheses on the purity charts are the approximate 
expected numbers of objects in each bin for a 1 square-degree survey. 



TABLE 4 

Lens search strategies, yields and statistics. 



Strategy 


HST yield (1 deg^, lO"* 


LRGs) 


SNAP yield (1000 deg^, 


10^ LRGs) 




Statistics 




Character 


Hr cut 




At 


Men. 




At 


Mens 


Rejection rate 


Purity 


Completeness 








(man-hours) 






(team-weeks) 




(%) 


(%) 


(%) 


realistic 


H, > 0.5 


399 


1 


5.4 


399000 


2.8 


5400 


96 


1.4 


54 




Hr > 1.5 


43 





3.2 


43000 


0.3 


3200 


100 


7.5 


32 




Hr > 2.5 


2 





1.9 


2000 





1900 


100 


100 


19 


optimistic 


Hr > 0.5 


6832 


19 


10 


6832000 


47.4 


10000 


32 


0.1 


100 




> 1.5 


4497 


12 


10 


4497000 


31.2 


10000 


55 


0.2 


100 




Hy > 2.5 


672 


2 


8.9 


672000 


4.7 


8900 


93 


1.3 


89 


naive 


Hr > 0.5 


5769 


16 


10 


5769000 


40.1 


10000 


42 


0.2 


100 




i/j > 1.5 


679 


2 


8.4 


679000 


4.7 


8400 


93 


1.2 


84 




Hy > 2.5 


89 





6.5 


89000 


0.6 


6500 


99 


7.3 


65 



Completeness - naive Purity - naive 




Fig. 9. — Completeness (left) and purity (right) charts for a naive 
prior distribution of //-values, namely one where all classifications 
are considered equally likely to occur. 

tern, which wc denote as 0e- 

In the left-hand panel of Figure [TO] we plot histograms 
of A9e = (6'e — &e)/^Oe for systems with human class 
H = 3 and H = 2 (which should each look like a Gaus- 
sian centered on zero having width unity). A peak of 
roughly correct width and centroid can be seen for the 
robust {H = 3) lenses, indicating that the robot's mod- 
eling is quite meaningful in some cases but somewhat 
inaccurate in others. However, the histogram has sig- 
nificant tails, especially at large inferred 9e- Some of 
this positive tail will be due to the presence of external 
convergence in the simulated image, but only at the few 
percent level. For the less convincing candidates, the his- 
togram is broader with a less pronounced central peak. 



as expected. 

We might also hope to use the robot output to learn 
about the source galaxy: in the right-hand panel of Fig- 
ure [TO] we plot a similar histogram of Arrig = (tos — rhs). 
While we do not infer an uncertainty Suis, the unit width 
Gaussian still provides a useful reference. A ^2 mag 
offset in the unlensed source magnitude can be seen, re- 
flecting the inability of the lens model to account for 
all the flux. This is a consequence of using necessarily 
(to save CPU time) inaccurate lens models: when the 
model does not quite predict the image correctly, there 
is some mismatch between the different multiply-imaged 
pixels' values. The minimum-filtering process then leads 
to an underestimate of the corresponding source plane 
pixel value. At the edges of the detected image features, 
this can lead to an unwanted zero value in the source 
plane, and so to an inferred source that is not only too 
faint, but also too small. The total flux of the source is 
then underestimated, and the rescaling performed before 
calculating is not enough to recover the lost flux. 

The robot's source magnitude estimates are therefore 
biased low. While not useful for source studies, mg is 
still a valid indicator of the model quality; indeed, the 
discussion above shows why this so. 

5. BLIND TESTING ON THE ECS SURVEY 

Having calibrated the lens-flnding robot on the train- 
ing set, wc now present its application to the 60 remain- 
ing EGS "survey" fields fSection 14. 1|) . Table [5] summa- 
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Fig. 10. — Left: accuracy of the model parameter 6e and its "uncertainty" 66^. Plotted are histograms of A6e = (^E — 6e)/(59e for 
the top two human classes, compared with a Gaussian of mean zero and width 1. Right: similar plot for the inferred (unlenscd) source 
magnitude (in the F606W band) mg - here there is no uncertainty (5ms, but the unit width Gaussian still provides a useful scale for 
comparison. 



TABLE 5 

egs search strategies, yields and statistics. 
Purity P, completeness C, and rejection rate R are all given as percentages. 



Character //^ cut 


N{H = 


3) N{H > 2) 


R 


P(ff = 3) 


P{H > 2) 


C(ff = 3) 


CiH > 2) 


realistic Hi- > 2.5 








100 


0.0 


0.0 








Hr > 1.5 





1 


96 


0.0 


2.4 





10 


optimistic //^ > 2.5 


1 


4 


89 


1.0 


3.8 


33 


40 


Hr > 1.5 


3 


9 


46 


0.6 


1.7 


100 


90 
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rizes the results of this run. All 988 lens candidates in 
these fields were visually inspected by a subset of the au- 
thors (PJM, DWH, LAM, MB, CDF), and their if-values 
simply averaged. (As noted before, this gives non-integer 
values of H.) The color images, before and after lens 
galaxy subtraction, were displayed to aid them in their 
decision-making via the egi web form. 

We ran both realistic and optimistic robots while per- 
forming the automated classification; Table [5] shows the 
statistics from these runs. Allowing for the small num- 
bers involved, we find that the completeness and pu- 
rity achieved by the robot are consistent with the pre- 
dictions from the training set. With the realistic prior 
we find that none of the three H = 3 systems are re- 
covered, consistent with the expected low completeness 
(Table 11]). With the optimistic prior, we achieve 100% 
completeness in H = 3 systems when setting the robot 
classification threshold to H,^ > 1.5, and 33% complete- 
ness with Hi > 2.5. That the latter is slightly lower 
than expected is a reflection on the complexity of two of 
the systems: as noted in M07, HST J141820.84-h523611.2 
(the "Dewdrop") has a very extended source, providing 
a lot of confusing structure and so fairly high values for 
i?d and S0E. HST J141735. 73-^522646.3 (the "Cross") 
has almost point-like images in an asymmetric pattern, 
caused by the combination of both internal ellipticity and 
external shear in the lens (e.g. M07). This leads to a 
poor model fit and a high value for n^. The rejection 
ratios ensuing from the optimistic search strategies (46 
and 89%) match well the predictions from the training 
set in Table 12 (55 and 93%). 

Figure [TT] shows all 10 objects with human class H > 
2.5 (the "A-list", three objects) or 1.5 < H < 2.5 (the 
"B-list", seven objects). The robot output data are tab- 
ulated in Table [SI The B-list systems include the three 
that were noted during the M06 & M07 by-eye searches - 
four are new lens candidates (albeit of low quality). One 
B-list system was rejected by the robot, a result of its 
large image separation. 

These four new B-list candidates illustrate an impor- 
tant point, namely that automated searches ameliorate 
the problem of human error in a by-eye search. By fo- 
cusing on the few galaxies consistent with being lenses, 
the distractions of the rest of the image are removed and 
there is less chance of missing an interesting object. 

6. DISCUSSION 

In this Section we first identify several areas where our 
robot's performance could be improved, and then com- 
pare our approach with others suggested in the literature. 

6.1. A more extensive training dataset 

The blind test on real survey data described in Sec- 
tion [5] suggests that our simulations are sufficiently real- 
istic to give us an accurate PDF for lens candidate classi- 
fication, although a larger sample of known lenses would 
assist here. For example, the HST/ AGS images taken 
during the SLACS survey would pro vide a training set o f 
some 70 galaxy-galaxy strong lenses (jBolton et al.ll2008l ). 
However, we must be careful to match the robot's train- 
ing set with the kinds of lenses expected to be found in 
high resolution imaging data. The selection function of 
the SLACS lenses is quite different, favoring low redshift 
lens galaxies and high magnification image configurations 



(Einstein rings). While it may be argued that the latter 
is a desirable property of the target lenses, if we seek 
to find all lenses we need to educate the robot accord- 
ingly. One option for future work would be to use the 
entire ECS survey (as well as the simulated lenses) as a 
training set - although some care may be needed when 
applying the robot to new data of different depth and 
resolution. If our training set is assumed adequate, then 
two sources of incompleteness and inefficiency remain. 
One is the treatment of the training set, and the other is 
the lens modeling itself. 

6.2. More accurate PDF modeling 

It can be seen in Figures [4]-[7] that there are a few out- 
liers to the derived PDFs, indicating that these model 
distributions are a somewhat lossy compression of the in- 
formation in the training set. One way of correcting this 
would be to increase the complexity, and therefore inclu- 
siveness, of the PDF models to reduce the number of out- 
liers. However, the problem of degeneracy between the 4 
different PDFs remains - this can be broken by increas- 
ing the dimensionality of the robot output data-space. 
While an exhaustive investigation of the individual cases 
is beyond the scope of this paper, we make the following 
general suggestion for future work. Most of the bright 
red objects passed to the robot for lens-modeling are 
not massive elliptical galaxies; these should give complex 
residual images even when the lensing-consistent fiux is 
subtracted. Some quantitative morphology analysis of 
such maps might produce a useful extra dimension to 
use in ruling out non-lenses. Likewise, the pre-selection 
of BRG candidates could be improved, using some mea- 
sure of concentration to favor the massive galaxies. 

One disadvantage of the approach taken here is that 
we treat the four classifications {H = 0,1,2,3) as exclu- 
sive and unrelated "bins" into which lens candidates fall; 
that is, we make no use of the fact that there is really 
a continuum from H = to H = 3, and that H = 1 is 
between H ^ and H ~ 2, and that H ~ is very far 
from _ff = 3. A more sophisticated approach would make 
use of this continuity information, perhaps by working 
with the five-dimensional distribution of the four scalars 
and H; that is, treating as a fifth scalar to be predicted 
using the other four. A larger training set, or a training 
set classified by a larger number of humans, could also 
effectively increase the granularity of the iJ-statistic and 
provide a better basis for treating the classification as a 
continuum rather than a discontinuous set of exclusive 
and unrelated bins. 

6.3. More accurate lens modeling 

The lens modeling itself could be made cleaner at the 
same time as improving this pre-seleetion: fitting the 
BRG light with a bulge-|-disk light profile may better sup- 
press the symmetrical disk-like residuals that can mimic 
lensed images, while providing some quantitative esti- 
mate of galaxy type (and hence mass). In the future, 
with surveys in many filters we can hope to extend this 
modeling to include the photometric redshift and stellar 
population properties of the BRG, and use the funda- 
mental plane to estimate the BRG mass directly. 

While the above desiderata may improve the efficiency 
of the lens search, they all favor smooth, clean lens galax- 
ies, and sparsely populated source planes. This approach 
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Fig. 11. — Automated lens search candidates in the EGS survey area. Candidates are sorted (left to right, top to bottom) by their 
human classification parameter, H - all 10 objects with H > 1.5 are shown. Both human and robot classification parameter (H,-) arc shown 
overlaid on the images, with color scheme green H = 3, amber —tH = 2 and red —* H = 1. All cutout images are 6arcsec on a side. 



TABLE 6 

Robot outputs for the 10 lens candidates in Figure [TT] 



Object name 


Position in 




9e 






Source magnitude 


Robot class 


Human class 




Figure [TTl 




(arcscc) 


(aresec] 




AB, F606W 


(Hr) 


(H) 


HSTJ141735. 73+522646. 3 


(1,1) 


13.23 


0.96 


0.07 


0.60 


30.42 


1.87 


3.00 


HSTJ141820. 84+523611. 2 


(2,1) 


4.08 


0.66 


0.14 


0.69 


27.92 


1.62 


3.00 


HSTJ141833. 11+524352.5 


(3,1) 


4.11 


0.84 


0.06 


0.62 


28.40 


2.86 


3.00 


HSTJ141759. 01+523514. 7 


(4,1) 


5.07 


0.96 


0.06 


0.57 


28.84 


2.72 


2.33 


HSTJ142053. 90+530608.1 


(5,1) 


7.19 


0.54 


0.04 


0.31 


29.56 


1.82 


2.33 


HSTJ141719. 80+522824. 3 


(1,2) 


-1.18 


0.60 


0.04 


0.75 


29.62 


2.97 


2.20 


HSTJ141807. 31+523030.0 


(2,2) 


1.69 


0.66 


0.16 


1.00 


29.29 


2.00 


2.00 


HSTJ141857. 11+523824.4 


(3,2) 


99.00 


0.00 


99.00 


0.00 


99.00 


0.00 


1.67 


HSTJ141958. 53+530123. 9 


(4,2) 


1.21 


0.66 


0.07 


0.48 


27.96 


2.71 


1.67 


HSTJ141731. 17+522636.2 


(5,2) 


21.31 


1.08 


0.07 


0.59 


28.75 


1.51 


1.60 



is somewhat justified for some of our strong lensing sci- 
ence goals, i.e. those that require samples of massive 
regular elliptical galaxies whose lensing effects are eas- 
ily modeled. In fact, one of the attractive aspects of 
the future imaging surveys is that they will be capable 
of sampling further down the lensing cross-section dis- 
tribution, to where the lens galaxies are higher redshift, 
and/or less bulge-dominated. We have shown that our 
approach can deal with massive galaxies that do have, 
for example, disk components, but at the cost of enlarg- 
ing the size of the visual inspection candidate list. If we 
want to find more exotic disky or complex lenses, then 
the robot's modeling capabilities must be increased ac- 
cordingly. The pre-selection can likely be made much 
more stringent (e.g. selecting close pairs of BRGs) - in 
which case a purely visual inspection may be the most 
effective strategy. However, we already see in the case of 
the EGS lenses that more than three lens model parame- 
ters are required for a good fit to the data: more work is 
required on making such flexible lens fits feasible in the 
available CPU time. 

An unwelcome side-effect of poor lens-modeling is 
that the output parameters may not be useful for some 
other desirable follow-on work. For example, the source 
plane photometry performed in this work is not accu- 
rate enough to estimate the photometric redshift of the 
source. This should probably not be a concern - even 



10* new lenses, once found in the SNAP survey for ex- 
ample, could be re-modeled straightforwardly given the 
computing power assumed in Table 21 

Finally, it is worth revisiting the assumptions made in 
our lens modeling algorithm. Perhaps the most impor- 
tant is our neglect of the PSF. For the high resolution 
image surveys we have restricted ourselves to, we expect 
this not to be a problem for the more numerous extended 
galaxy-source lenses. However, we should not be sur- 
prised to find the robot failing to detect lensed quasars, 
especially if they are very bright. A more advanced lens 
modeler would have to incorporate some form of decon- 
volution; the most stable way to do this is to work with 
a model source and infer its parameters by predicting 
the image plane. However, this will add parameters to 
the model and prevent us using the efficient "minimum- 
filtering" scheme. Nevertheless, detecting bright lensed 
quasars would require the PSF to be taken into account 
properly, as would extending our approach to ground- 
based data, with its larger PSF width to source size ra- 
tio. 



6.4. A more objective classification scheme 

In order to bypass the more straightforwardly inferred 
human classification parameter, and instead answer the 
question "Is this a gravitational lens or not?" we would 
have to do the following: for each conceivably massive. 
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distant galaxy in a large imaging survey, search the entire 
space of all reasonable models for that galaxy's lensing 
potential, and search the entire space of reasonable dis- 
tributions of resolved and unresolved background sources 
in angle, redshift, and color, looking for a model to ex- 
plain the morphology observed in and around the galaxy 
image. If, in this enormous space, there is a reason- 
able model for the potential and a reasonable model for 
the background sources that explains a significant and 
not fine-tuned portion of the intensity in and around the 
image of the lensing galaxy, then that galaxy is a very 
strong candidate for a multiply- imaging lens. The null 
model against which this would be compared would be 
that where all features in the elliptical-subtracted im- 
age are assumed unlensed. The key point is that the 
lensing hypothesis is potentially simpler for comparable 
(and perhaps better) goodness of fit, since fewer individ- 
ual sources need to be fitted. 

In practice, it is not currently possible to implement 
this scheme in full. As previously discussed, at present 
it is not practical to perform fully general lens model- 
ing for every object. Perhaps more importantly, there is 
also currently no reliable way to determine which parts 
of the image of a putative lensing galaxy are part of that 
galaxy (or foreground) and which originate from back- 
ground sources. For this we would need a comprehensive 
understanding of galaxy morphology quantified as a com- 
plex joint prior for the morphology parameters. While 
there are pro mising signs of such a formalism being de- 
veloped fe.g. iLotz erall 120041 : Oviassev et al1l2004a[ ). we 
are not there yet. This means that, at present, there is no 
quantitative meaning to the words "reasonable," "signif- 
icant," and "not fine-tuned," and so the usual evidence 
ratio used for model selection is not available to us. 

Nonetheless, the method we have described here repre- 
sents a first attempt at the model-oriented lens searching 
scheme, including a number of necessary approximations 
and simplifications. Its extension to more powerful mod- 
els and fitting algorithms is straightforward, and we leave 
this to future work. The quantification of non-lens galaxy 
morphology presents a greater challenge. 

6.5. Comparison with other automated lens detection 
algorithms 

Most of the automated lens detection schemes pro- 
posed to date have focused o n finding curved arc- 
like s t ructures (ILenzen et al . 2005: S eidel fc BartelmannI 
[2007t lAlaTdl [2007riKubo fc DeU'Antoniol 120081 ). These 
have the benefit of finding lenses by their sources, not 
their lens galaxies - an imp ortant distinction for lens- 
statistics studies (e.g. lKeeton 2002; Kochanek 2006) and 
for finding dark gravitational lenses. While none of these 
methods rely on the subtraction of the lens galaxy light 
before applying the algorithms, it would seem profitable 
to do so when searching for galaxy-scale lenses. All in- 
clude a final human inspection step to provide quality 
control. 

Th e Ringfinder algorith m (Gavazzi et al in prepara- 
tion, ICabanac et al.ll2007h does include the subtraction 
of the lens light: as this is modeled by rescaling the red- 
dest available image, the method is restricted to multi- 
filter data. It was designed for the CFHT legacy survey 
ground-based data, and so represents the first attempt at 
specifically finding galaxy-scale lenses robotically, from 



the ground. While the analysis of the blue residuals is 
more ad-hoc, the results could in principle be trained in 
the same way we have described here. 

The arc-detection algorithm used by lEstrada et al.l 
(|2007l ) in searching the SDSS images shares a key fea- 
ture with our robot: they use a neural network to assess 
their (different) 4 data that describe each candidate arc. 
The probabilistic framework presented here can itself be 
viewed as a simple machine-learning process. Although 
crude (and in fact, hand-made), our framework does have 
the important benefit of providing some the insight into 
the problem. A fruitful line of future enquiry could be to 
replace our robot's PDF (with its dependence on a lim- 
ited data vector d) with a neural network; the natural 
extension of this would then be to increase the number of 
inputs to be the pixels of the image itself. This approach 
would perhaps solve the problem of the non-lens galaxy 
morphology as well. 

7. CONCLUSIONS 

We have developed a novel approach to the automated 
detection and classification of strong galaxy-scale gravi- 
tational lenses in high resolution imaging surveys. After 
training our software robot on simulated and real HST 
data we draw the following conclusions: 

• For high resolution data and sufficiently faint and 
extended images, we can neglect the PSF and re- 
duce the complexity of the lens model to three 
parameters, generating the unlensed source plane 
by a simple and robust ray-tracing and minimum- 
filtering scheme. In this way, our robot is able to 
return crude lens models that predict the images 
seen in both simulated and real galaxy-scale lenses. 

• While the Einstein radius and source magnitude 
returned by the robot are not yet accurate enough 
for further use, improvements in the lens mod- 
eling should make these parameters useful. The 
automated nature of the detection process means 
that the selection function is well-defined, such that 
measuring e.g. dn/dO^ should be meaningful. 

• Using a set of data derived from the lens model, we 
infer for each candidate the classification parame- 
ter H that a human inspector would have assigned 
it. This is a well-defined procedure that can be 
calibrated straightforwardly using a large sample 
of simulated lenses and known non-lenses; the cali- 
bration information is compressed as a set of PDFs 
whose estimation comprises the "training" of the 
robot. While some systems in the training set re- 
main as outliers to the model distribution, this does 
not have a catastrophic effect on the automated 
classification. 

• The completeness and purity of any survey are 
partly determined by the prior PDF on the clas- 
sification parameter, in our case Pt{H). A realis- 
tic prior distribution of -ff- values heavily favors the 
classification H ^ (on the grounds that they are 
known to be much more common), and predicts 
higher class objects to be correspondingly rare: it 
makes the search efficient, with any loss of com- 
pleteness being due to the inadequacies of the mod- 
eling process. Setting a threshold of > 2.5, 
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wc find a purity of ~ 100% at a completeness of 
~ 20%. 

• We can choose to improve the completeness at the 
expense of the purity and efficiency by changing 
the prior PDF, which is the analog of employing 
a classifier of different character. A more "opti- 
mistic" classifier would be happy to see many more 
high class objects, and at least at present this seems 
necessary to achieve 100% completeness. The price 
of high completeness is a low rejection rate: with 
the optimistic robot threshold set to > 2.5, we 
find a rejection rate of ~ 90% at a completeness of 
- 90%. 

• A realistic robot may be most appropriate for fu- 
ture large imaging surveys where human inspection 
is costly. A 1000 square degree survey with a space 
telescope such as SNAP would contain ~ 10^ BRGs 
and ~ 10** lenses; the current realistic robot's sam- 
ple would comprise 2000 of these, with no inspec- 
tion required. 

• An optimistic robot is more appropriate for 
present-day lens searches, where the numbers are 
small enough for human inspection to be cheap, 
and where every new lens counts. A search area 
of 1 square degree, such as that provided by the 
HST/ ACS archive, would lead to an optimistic 
robot sample of ~ 5000 candidates, with the hu- 
man classification taking ~ 1 day. 

• Applying the optimistic robot to the EGS survey, 
we recovered all three human class-3 lenses, and all 
but one of the three human class-2 lens candidates 
from M06 and MOT. We also discovered four new 
human class-2 lens candidates. 

At the time of writing, the era of wide-field imaging 
from space is still 1 decade away. Continuing to train 



our software robot on HST da,ta,, we should be optimistic 
about the prospects of a feasible search generating a well- 
defined statistical sample of galaxy-scale strong lenses 
from an imaging survey like that of SNAP. The approach 
we have described here is maximally informative, in that 
it incorporates our expectations about the typical lenses 
in the universe. However, perhaps the biggest challenge 
will be discovering the more complex and unexpected 
strong lenses yet to be seen. 
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